Improved Dyna-Q: A Reinforcement Learning Method Focused via Heuristic Graph for AGV Path Planning in Dynamic Environments
Abstract
Dyna-Q is a reinforcement learning method widely used in AGV path planning. However, in large, complex dynamic environments, the sparse reward function of Dyna-Q and the large search space cause low search efficiency, slow convergence, and even failure to converge, which seriously reduces the method's performance and practicability. To solve these problems, this paper proposes an Improved Dyna-Q algorithm for AGV path planning in large, complex dynamic environments. First, to address the large search space, the paper proposes a global path guidance mechanism based on a heuristic graph, which can effectively reduce the path search space and thus improve the efficiency of obtaining the optimal path. Second, to address the sparse reward function in Dyna-Q, the paper proposes a novel dynamic reward function and an action selection method based on the heuristic graph, which provide more intensive feedback and more efficient action decisions for AGV path planning, improving the convergence of the algorithm. We evaluated our approach in scenarios with static obstacles and with dynamic obstacles. The experimental results show that the proposed algorithm can obtain better paths more efficiently than other reinforcement-learning-based methods, including the classical Q-Learning and Dyna-Q algorithms.
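For orientation, the following is a minimal sketch of classical tabular Dyna-Q, the baseline the abstract refers to, and not the paper's Improved Dyna-Q. The 5x5 grid, the sparse goal-only reward, the hyperparameters, and all function names are illustrative assumptions rather than details taken from the paper. It shows the three interleaved steps of Dyna-Q: a direct Q-learning update, a model update, and planning updates replayed from the learned model.

```python
import random

# Hypothetical 5x5 grid (an assumption): 0 = free cell, 1 = static obstacle.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic transition; a blocked move leaves the AGV in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(GRID) and 0 <= c < len(GRID[0])) or GRID[r][c] == 1:
        r, c = state
    nxt = (r, c)
    reward = 1.0 if nxt == GOAL else 0.0  # sparse reward: only the goal pays
    return nxt, reward


def dyna_q(episodes=200, planning_steps=20, alpha=0.5, gamma=0.95,
           epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {}      # (state, action_index) -> estimated value
    model = {}  # (state, action_index) -> (next_state, reward)
    n = len(ACTIONS)

    def best(state):
        # Greedy action with random tie-breaking.
        values = [q.get((state, a), 0.0) for a in range(n)]
        top = max(values)
        return rng.choice([a for a, v in enumerate(values) if v == top])

    def update(s, a, r, s2):
        # One-step Q-learning backup; the goal is terminal.
        future = 0.0 if s2 == GOAL else max(q.get((s2, b), 0.0) for b in range(n))
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (r + gamma * future - old)

    for _ in range(episodes):
        s = START
        while s != GOAL:
            a = rng.randrange(n) if rng.random() < epsilon else best(s)
            s2, r = step(s, ACTIONS[a])
            update(s, a, r, s2)              # direct reinforcement learning
            model[(s, a)] = (s2, r)          # model learning
            for _ in range(planning_steps):  # planning from simulated experience
                ps, pa = rng.choice(list(model))
                ps2, pr = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return q


def greedy_path(q, max_len=50):
    """Follow the greedy policy from START; stops at GOAL or after max_len moves."""
    path, s = [START], START
    while s != GOAL and len(path) <= max_len:
        a = max(range(len(ACTIONS)), key=lambda b: q.get((s, b), 0.0))
        s, _ = step(s, ACTIONS[a])
        path.append(s)
    return path
```

Calling `greedy_path(dyna_q())` returns the list of cells visited by the greedy policy after training. With a goal-only reward like this one, early episodes amount to a random walk until the goal is first reached, which illustrates on a small scale the sparse-reward issue the abstract describes.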
Related resources
Wp-dyna: Planning and Reinforcement Learning in Well-plannable Environments
Reinforcement learning (RL) involves sequential decision making in uncertain environments. The aim of the decision-making agent is to maximize the benefit of acting in its environment over an extended period of time. Finding an optimal policy in RL may be very slow. To speed up learning, one often used solution is the integration of planning, for example, Sutton’s Dyna algorithm, or various oth...
Dyna-H: a heuristic planning reinforcement learning algorithm applied to role-playing-game strategy decision systems
In a Role-Playing Game, finding optimal trajectories is one of the most important tasks. In fact, the strategy decision system becomes a key component of a game engine. Determining the way in which decisions are taken (online, batch or simulated) and the resources consumed in decision making (e.g. execution time, memory) will influence, to a major degree, the game performance. When classical sear...
The Z Method for Fast Path Planning in Dynamic Environments
We present a method to plan collision-free paths for robots with any number of degrees of freedom in dynamic environments. The method proved to be very efficient, as it omits a complete representation of the high-dimensional search space. Its complexity is linear in the number of degrees of freedom. A preprocessing of the geometry data of the robot or the environment is not required. With the time ...
Path Planning in Dynamic Environments
The motion planning problem for mobile robots is typically formulated as follows: given a robot and a description of an environment, plan a path of the robot between two specified locations that is collision-free and satisfies certain optimization criteria. Traditionally there are two approaches to the problem: off-line planning, which assumes a perfectly known and stable environment, and on-li...
Journal
Journal title: Drones
Year: 2022
ISSN: 2504-446X
DOI: https://doi.org/10.3390/drones6110365